Idea Density – A Potentially Informative Characteristic of Retrieved Documents

Author

  • Michael A. Covington
Abstract

Idea density, the number of propositions divided by the number of words, is a well-known psycholinguistic measure that can now be estimated reliably by software. Preliminary tests indicate that idea density distinguishes between documents on the same subject written for specialist and nonspecialist audiences, and that it does not correlate with lexical diversity or Flesch-Kincaid readability.
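As an illustration of the ratio defined in the abstract, the following is a minimal sketch and not the software the author refers to. It assumes the text has already been part-of-speech tagged, and it assumes, purely for illustration, that each verb, adjective, adverb, adposition, and conjunction counts as one proposition. The tag names, the `PROPOSITION_TAGS` set, and the `idea_density` function are hypothetical choices made for this example.

```python
from typing import Iterable, Tuple

# Assumption: coarse part-of-speech tags treated as proposition-bearing.
# This set is an illustrative approximation, not taken from the paper.
PROPOSITION_TAGS = {"VERB", "ADJ", "ADV", "ADP", "CCONJ", "SCONJ"}


def idea_density(tagged_tokens: Iterable[Tuple[str, str]]) -> float:
    """Return propositions divided by words for (token, tag) pairs.

    Punctuation tokens (tag "PUNCT") are not counted as words.
    """
    words = 0
    propositions = 0
    for _token, tag in tagged_tokens:
        if tag == "PUNCT":
            continue
        words += 1
        if tag in PROPOSITION_TAGS:
            propositions += 1
    if words == 0:
        raise ValueError("idea density is undefined for a text with no words")
    return propositions / words


# Hand-tagged sample sentence: 5 words, 3 of them proposition-bearing.
sample = [
    ("The", "DET"),
    ("old", "ADJ"),
    ("dog", "NOUN"),
    ("slept", "VERB"),
    ("quietly", "ADV"),
    (".", "PUNCT"),
]
density = idea_density(sample)
print(f"idea density: {density:.2f}")
```

A real estimator would need a tagger and additional adjustment rules; the sketch only shows how the ratio itself is formed once propositions have been counted.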


Similar resources

VisIRR: Interactive Visual Information Retrieval and Recommendation for Large-scale Document Data

We present a visual analytics system called VisIRR, which is an interactive visual information retrieval and recommendation system for document discovery. VisIRR effectively combines both paradigms of passive pull through query processes for retrieval and active push that recommends the items of potential interest based on the user preferences. Equipped with efficient dynamic query interfaces...


CLEF-IP 2010: Prior Art Retrieval Using the Different Sections in Patent Documents

In this paper we describe our participation in the 2010 CLEF-IP Prior Art Retrieval task where we examined the impact of information in different sections of patent documents, namely the title, abstract, claims, description and IPC-R sections, on the retrieval and re-ranking of patent documents. Using a standard bag-of-words approach in Lemur we found that the IPC-R sections are the most inform...


A Scalable Architecture for XML Retrieval

Whereas in classical text collections, documents are considered as atomic units, we consider in XML collections elements in documents. This augmented view increases the number of potentially retrieved objects. A retrieved object can be a document, an element in a document, an aggregation of elements or of documents or the whole collection itself. The increase in the number of objects to be inde...


A Multiple-stage Approach to Re-ranking Clinical Documents

This paper presents our approach to medical information retrieval and experimental results of participating in eHealth Task 3-A of CLEF 2014. The task is to retrieve relevant documents from a medical collection given a query generated from a discharge summary. The key idea of our method is to compute accurate similarity scores via multiple stages of re-ranking documents from initial documents r...


A Clustering Method of Highly Dimensional Patent Data Using Bayesian Approach

Patent data contain diverse technological information from every technology field, so many companies manage patent data to build their R&D policy. Patent analysis is one approach to patent management and an important tool for technology forecasting. Patent clustering is one of the tasks in patent analysis. In this paper, we propose an efficient clustering method ...




Publication date: 2008